Bufr process by yuvraajnarula · Pull Request #177 · openclimatefix/graph_weather

yuvraajnarula · 2025-10-30T06:51:12Z

Pull Request

Description

This PR adds a new BUFR processor module to enable reading and decoding NOMADS BUFR files into the NNJA-AI-compatible Parquet format.

The processor is designed to:

Decode BUFR messages using ecCodes (via the Python bindings)
Convert decoded data to the NNJA-AI archive schema, enabling seamless integration with existing workflows
Support decoding of initial high-priority observation types:
ADPUPA (upper-air soundings)
CrIS and IASI hyperspectral soundings
Serve as a modular component, so it can later be split into a dedicated repo for broader operational use.
Fixes #170
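
As a rough illustration of the decode step listed above, the sketch below reads every message in a BUFR file with the ecCodes Python bindings and gathers the requested keys into columns, the shape a Parquet writer expects. It is not the PR's code: the function names `iter_bufr_messages` and `records_to_columns` are made up for this example, and the choice of keys is left to the caller.

```python
def iter_bufr_messages(path, keys):
    """Yield one dict per BUFR message, holding the requested ecCodes keys.

    Hypothetical helper for illustration; eccodes is imported lazily so the
    module can be loaded on machines without the ecCodes library.
    """
    import eccodes

    with open(path, "rb") as handle:
        while True:
            msg = eccodes.codes_bufr_new_from_file(handle)
            if msg is None:  # end of file
                break
            try:
                # Expand the data section so that data keys become readable.
                eccodes.codes_set(msg, "unpack", 1)
                record = {}
                for key in keys:
                    try:
                        record[key] = eccodes.codes_get_array(msg, key)
                    except eccodes.KeyValueNotFoundError:
                        record[key] = None  # key absent from this message
                yield record
            finally:
                eccodes.codes_release(msg)


def records_to_columns(records, keys):
    """Turn an iterable of per-message dicts into a column -> list mapping."""
    columns = {key: [] for key in keys}
    for record in records:
        for key in keys:
            columns[key].append(record.get(key))
    return columns
```

A caller would pass the result of `iter_bufr_messages(path, keys)` straight into `records_to_columns`, then rename the columns to the target schema before writing Parquet.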

How Has This Been Tested?

Added a pytest test folder with a test that:

Reads a BUFR file from the NNJA archive
Converts it to Parquet
Compares the output with a reference NNJA-AI Parquet file
Passes if schemas and values match exactly
Yes
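
The "schemas and values match exactly" check described above could look roughly like the following. This is a minimal sketch using plain Python column mappings as a stand-in for the two Parquet tables; `diff_tables` is a hypothetical name, not a function from the PR. With real Parquet files, `pandas.testing.assert_frame_equal` would be a natural alternative.

```python
import math


def _same_value(a, b):
    """Exact equality, except that two NaNs count as equal (missing values)."""
    if isinstance(a, float) and isinstance(b, float):
        if math.isnan(a) and math.isnan(b):
            return True
    # Requiring identical types makes 1 and 1.0 a schema mismatch.
    return type(a) is type(b) and a == b


def diff_tables(actual, expected):
    """Return a list of differences between two column -> list mappings.

    An empty list means column names, column order, lengths, value types
    and values all match.
    """
    if list(actual) != list(expected):
        return [f"columns differ: {list(actual)} != {list(expected)}"]
    problems = []
    for name in expected:
        a_col, e_col = actual[name], expected[name]
        if len(a_col) != len(e_col):
            problems.append(f"{name}: length {len(a_col)} != {len(e_col)}")
            continue
        for row, (a, e) in enumerate(zip(a_col, e_col)):
            if not _same_value(a, e):
                problems.append(f"{name}[{row}]: {a!r} != {e!r}")
    return problems
```

Returning the list of differences rather than a bare boolean makes a failing comparison easier to debug, since the first mismatching column and row are named.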

If your changes affect data processing, have you plotted any changes? i.e. have you done a quick sanity check?

Yes

Checklist:

My code follows OCF's coding style guidelines
I have performed a self-review of my own code
I have made corresponding changes to the documentation
I have added tests that prove my fix is effective or that my feature works
I have checked my code and corrected any misspellings


yuvraajnarula · 2025-10-30T06:53:13Z

@jacobbieker, I tried finding some files for this, but they were too large to use in tests. Would you mind if I showed you the output for a small batch of those files, so I can check whether this matches your vision?

jacobbieker

There need to be some more changes for this, but thanks for the nice first step.

As for the test files, I think a good compromise would be to have some integration tests that pull real, historical NNJA BUFR files and the corresponding NNJA-AI representation, run the processing, and confirm that they match everywhere. These can be marked with pytest.mark.skip so they are skipped in GitHub CI, but I can then run them locally and see if they match exactly.

For this, I would also probably cut the PR down to a single observation type, just ADPUPA for now. That makes it simpler to review and to make sure the setup works correctly with the real files.
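
A skipped integration test along the lines suggested in this comment might be laid out as below. This is only a sketch: the file paths and the two helper functions are placeholders invented for the example, and the real test would call the PR's processor and a Parquet reader instead.

```python
import pytest

# Hypothetical local paths; the real NNJA files are too large to commit.
BUFR_PATH = "data/adpupa_sample.bufr"
REFERENCE_PATH = "data/adpupa_reference.parquet"


def convert_bufr_to_columns(path):
    """Placeholder for the BUFR processor under review."""
    raise NotImplementedError


def load_reference_columns(path):
    """Placeholder for reading the NNJA-AI reference Parquet file."""
    raise NotImplementedError


@pytest.mark.skip(reason="needs large NNJA files; remove the mark to run locally")
def test_adpupa_matches_nnja_ai_reference():
    actual = convert_bufr_to_columns(BUFR_PATH)
    expected = load_reference_columns(REFERENCE_PATH)
    # Same columns in the same order, then identical values.
    assert list(actual) == list(expected)
    assert actual == expected
```

An unconditional `pytest.mark.skip` means the test never runs until the mark is removed by hand. A `pytest.mark.skipif` on, say, the absence of the local data files would let the same test run automatically wherever the files exist.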

jacobbieker · 2025-10-30T08:44:25Z

+
+    source_name = "ADPUPA"
+
+    def _build_mappings(self):


Is that the only data in the NNJA-AI ADPUPA? I thought there were more variables

yuvraajnarula · 2025-11-01

I only wrote the basic ones and actually wanted to ask you about that. I was leaning towards the primary descriptors when I loaded conv-adpupa-NC002001 for reference.
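
One way to answer the "are there more variables?" question mechanically is to compare the mapping against the column names of a reference file. The sketch below assumes a mapping class shaped like the snippet under review (`source_name`, `_build_mappings`, `field_mappings`); the base class, the `to_row` and `missing_columns` methods, and the output column names are all hypothetical and are not taken from the NNJA-AI schema.

```python
class ObservationSchema:
    """Hypothetical base class: maps decoded BUFR keys to output columns."""

    source_name = ""

    def __init__(self):
        self.field_mappings = {}
        self._build_mappings()

    def _build_mappings(self):
        raise NotImplementedError

    def to_row(self, decoded):
        """Rename decoded keys to output columns; absent keys become None."""
        return {
            column: decoded.get(key)
            for key, column in self.field_mappings.items()
        }

    def missing_columns(self, reference_columns):
        """List reference columns that no mapping entry produces."""
        mapped = set(self.field_mappings.values())
        return [name for name in reference_columns if name not in mapped]


class AdpupaSchema(ObservationSchema):
    source_name = "ADPUPA"

    def _build_mappings(self):
        # Keys on the left are ecCodes BUFR key names; the column names on
        # the right are illustrative placeholders only.
        self.field_mappings = {
            "latitude": "LAT",
            "longitude": "LON",
            "pressure": "PRESSURE",
            "airTemperature": "TEMPERATURE",
        }
```

Calling `AdpupaSchema().missing_columns(...)` with the column names of the reference Parquet file would list every variable the mapping does not yet cover.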

jacobbieker · 2025-10-30T08:44:46Z

+    source_name = "CrIS"
+
+    def _build_mappings(self):
+        self.field_mappings = {


This should definitely have more, I think?

jacobbieker · 2025-10-30T08:46:42Z

Why is this being moved and renamed? Seems unnecessary


yuvraajnarula and others added 9 commits October 16, 2025 16:20

feat: baseline processor init + fixes with nnjai

ce01aaf

feat : introduced schemas cris, adpupa pipelines for nnjaai mapping

0b5410d

feat : initial bufr processor with adpupa and cris support

b15b6e9

feat : bufr func checks

c3e2e4b

feat : initiated pytest for bufr_process

0f454f9

feat : pytest established, splitting left

a05426a

chore : distrbuted files for better readability

55e6ef1

Merge branch 'openclimatefix:main' into bufr_process

e494436

[pre-commit.ci] auto fixes from pre-commit.com hooks

ea055d9

for more information, see https://pre-commit.ci

jacobbieker requested changes Oct 30, 2025


yuvraajnarula and others added 5 commits November 4, 2025 21:13

chore : added more vars for adpupa

65ee1f8

chore : updated cris schema

62eae0e

[pre-commit.ci] auto fixes from pre-commit.com hooks

b76fd57

for more information, see https://pre-commit.ci

chore : minor changes as requested

71d9b69

[pre-commit.ci] auto fixes from pre-commit.com hooks

7517a36

for more information, see https://pre-commit.ci

yuvraajnarula marked this pull request as draft December 8, 2025 03:49

yuvraajnarula mentioned this pull request Dec 8, 2025

Add initial ADPUPA BUFR processor + integration test #183

Open

7 tasks

yuvraajnarula wants to merge 14 commits into openclimatefix:main from yuvraajnarula:bufr_process
